Conceptual Image Retrieval over the Wikipedia Corpus

Authors

  • Adrian Popescu
  • Hervé Le Borgne
  • Pierre-Alain Moëllic
Abstract

Image retrieval in large-scale databases currently relies on text string matching, a technique that produces good results as long as the annotations associated with pictures are accurate and sufficiently detailed. These conditions are not met for the large majority of image corpora, such as the Wikipedia collection, which motivates exploring methods that go beyond string matching. In this paper, we present our approach to image retrieval, tested in the ImageCLEF 2008 WikipediaMM task. The approach is based on query reformulation using concepts that are semantically related to those in the initial query. For each entity of interest in the query, we used Wikipedia and WordNet to extract a list of related concepts, which were then ranked so that the most salient are proposed first. We also compiled a list of visual concepts, which were used to re-rank the answers to queries that included these visual concepts, implicitly or explicitly. The CEA submitted two automatic runs, one based on query reformulation only and one combining query reformulation and visual concepts, which were ranked 4th and 2nd using the MAP measure.
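The two steps named in the abstract, query reformulation with ranked related concepts and re-ranking by visual concepts, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `RELATED` table stands in for concepts that would be mined from Wikipedia and WordNet, the salience scores, the `VISUAL_CONCEPTS` set, the `visual_boost` weight and the sample images are all hypothetical, and only visual concepts named explicitly in the query are handled.

```python
# Hypothetical related-concept table with made-up salience scores; in the
# paper such lists are extracted from Wikipedia and WordNet.
RELATED = {
    "bridge": [("suspension bridge", 0.9), ("viaduct", 0.7), ("river", 0.3)],
}

# Hypothetical list of visual concepts a detector is assumed to recognise.
VISUAL_CONCEPTS = {"night", "indoor", "portrait"}


def reformulate(query, top_k=2):
    """Map each query term to weight 1.0 and add its top_k related concepts."""
    weighted = {}
    for term in query.lower().split():
        weighted[term] = 1.0
        ranked = sorted(RELATED.get(term, []), key=lambda p: p[1], reverse=True)
        for concept, score in ranked[:top_k]:
            weighted[concept] = max(weighted.get(concept, 0.0), score)
    return weighted


def text_score(annotation, weighted_terms):
    """Sum the weights of terms found in the annotation (naive substring match)."""
    text = annotation.lower()
    return sum(w for term, w in weighted_terms.items() if term in text)


def search(query, images, visual_boost=1.0):
    """Rank images by text score, then boost those showing a requested visual concept."""
    terms = reformulate(query)
    wanted_visual = VISUAL_CONCEPTS & set(query.lower().split())
    results = []
    for image_id, annotation, detected in images:
        score = text_score(annotation, terms)
        if score == 0:
            continue  # no textual evidence at all: not retrieved
        score += visual_boost * len(wanted_visual & detected)
        results.append((image_id, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Toy collection: (id, annotation, visual concepts assumed to be detected).
IMAGES = [
    ("img1", "Suspension bridge at noon", set()),
    ("img2", "A viaduct in southern France", set()),
    ("img3", "Old stone bridge", {"night"}),
    ("img4", "City hall portrait", {"portrait"}),
]

if __name__ == "__main__":
    for image_id, score in search("bridge night", IMAGES):
        print(image_id, round(score, 2))
```

In this toy collection, `img2` never mentions "bridge" and is retrieved only through the related concept "viaduct", while `img3` moves ahead of `img1` once the visual boost is applied. Setting `visual_boost=0.0` gives the text-only ordering for comparison.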

Similar resources

Conceptual Image Retrieval over a Large Scale Database

Image retrieval in large-scale databases currently relies on text string matching. However, this approach requires accurate image annotations, which are not available on the Web. To tackle this issue, we propose a reformulation method that reduces the influence of noisy image annotations. We extract a ranked list of related concepts for terms in the query from WordNet and W...

Visual Reranking for Image Retrieval over the Wikipedia Corpus

This paper describes the approach we developed for the WikipediaMM task in 2009 [4], which builds on our contribution from the previous year. The main novelties are the refinement of the textual query expansion procedure and the introduction of a k-NN based visual reranking procedure. Our main purpose was to test whether combining textual and content-based retrieval improves over purely textual search and the r...

Conceptual Hierarchical Clustering of Documents using Wikipedia knowledge

In this paper, we propose a novel method for conceptual hierarchical clustering of documents using knowledge extracted from Wikipedia. A robust and compact document representation is built in real-time using the Wikipedia API. The clustering process is hierarchical and creates cluster labels which are descriptive and important for the examined corpus. Experiments show that the proposed techniqu...

Query Refinement and User Relevance Feedback for Contextualized Image Retrieval

The motivation of this paper is to increase the user-perceived precision of the results of Content Based Information Retrieval (CBIR) systems with Query Refinement (QR), Visual Analysis (VA) and Relevance Feedback (RF) algorithms. The proposed algorithms were implemented as modules in the K-Space CBIR system. The QR module discovers hypernyms for the given query from a free text corpus (Wikipedia) an...

Exploiting Cooccurrence on Corpus and Document Level for Fair Crosslanguage Retrieval

In this paper we describe the methodology, architecture and implementation of the information retrieval system we have developed for the Robust WSD Task at CLEF 2008. Our system is based on an extensive query preprocessing step for homogenisation of the corpus queries. The preprocessing of queries includes: firstly, a query expansion step based on WordNet synonyms or an Associative Index, sec...

Publication date: 2008